<html>
<head>
<!--

    DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.

    Copyright (c) 2010-2017 Oracle and/or its affiliates. All rights reserved.

    The contents of this file are subject to the terms of either the GNU
    General Public License Version 2 only ("GPL") or the Common Development
    and Distribution License("CDDL") (collectively, the "License").  You
    may not use this file except in compliance with the License.  You can
    obtain a copy of the License at
    https://oss.oracle.com/licenses/CDDL+GPL-1.1
    or LICENSE.txt.  See the License for the specific
    language governing permissions and limitations under the License.

    When distributing the software, include this License Header Notice in each
    file and include the License file at LICENSE.txt.

    GPL Classpath Exception:
    Oracle designates this particular file as subject to the "Classpath"
    exception as provided by Oracle in the GPL Version 2 section of the License
    file that accompanied this code.

    Modifications:
    If applicable, add the following below the License Header, with the fields
    enclosed by brackets [] replaced by your own identifying information:
    "Portions Copyright [year] [name of copyright owner]"

    Contributor(s):
    If you wish your version of this file to be governed by only the CDDL or
    only the GPL Version 2, indicate your decision by adding "[Contributor]
    elects to include this software in this distribution under the [CDDL or GPL
    Version 2] license."  If you don't indicate a single choice of license, a
    recipient has the option to distribute your version of this file under
    either the CDDL, the GPL Version 2 or to extend the choice of license to
    its licensees as provided above.  However, if you add GPL Version 2 code
    and therefore, elected the GPL Version 2 license, then the option applies
    only if the new code is made subject to such option by the copyright
    holder.

-->

	<title>Design of XSOM</title>
	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
	<style>
		pre {
			background-color: rgb(240,240,240);
			margin-left:	2em;
			margin-right: 2em;
			padding: 1em;
		}
		p {
			margin-left: 2em;
		}
		dt {
			margin-top: 0.5em;
			margin-left: 2em;
			font-weight: bold;
		}
		dd {
			margin-left: 3em;
		}
	</style>
</head>
<body>

<h1 style="text-align:center">Design of XSOM</h1>
<div align=right style="font-size:smaller">
By <a href="mailto:kohsuke.kawaguchi@sun.com">Kohsuke Kawaguchi</a><br>
</div>

<p>
	This document describes the details you need to know to extend or maintain XSOM.
</p>

<h1>Design Goals</h1>
<p>
	The primary design goals of XSOM are:
</p>
<ol>
	<li>Expose all the information defined in the schema spec
	<li>Provide additional methods that help simplify client applications.
</ol>
<p>
	Providing mutation methods was a non-goal for this project, primarily because of the added complexity.
</p>


<h1>Building workspace</h1>
<p>
	The workspace uses Ant as the build tool. The following are the major targets:
</p>
<dl>
	<dt>clean</dt>
	<dd>remove the intermediate and output files.</dd>
	
	<dt>compile</dt>
	<dd>generate a parser by RelaxNGCC and compile all the source files into the bin directory.</dd>
	
	<dt>jar</dt>
	<dd>make a jar file</dd>
	
	<dt>release</dt>
	<dd>build a distribution zip file that contains everything from the source code to a binary file</dd>
	
	<dt>src-zip</dt>
	<dd>build a zip file that contains the source code.</dd>
</dl>

<h1>Architecture</h1>
<p>
	XSOM consists of roughly three parts.
	
	The first part is the public interface, which is defined in the <code>com.sun.xml.xsom</code> package. The entire functionality of XSOM is exposed via this interface. This interface is derived from a draft document submitted to W3C by some WG members.
</p><p>
	The second part is the implementation of these interfaces, the <code>com.sun.xml.xsom.impl</code> package. This code is all hand-written.
</p><p>
	The third part is a parser that reads the XML representation of XML Schema and builds XSOM nodes accordingly. The package is <code>com.sun.xml.xsom.parser</code>. This part of the code is mostly generated by <a href="http://relaxngcc.sourceforge.net/">RelaxNGCC</a>.
</p>
<center>
	<img src="architecture.png"/>
</center>




<h1>Implementation Details</h1>
<p>
	Most of the implementation classes are fairly simple. Probably the only interesting piece of code is the <code>Ref</code> class, which represents a reference to another schema component.
</p><p>
	The <code>Ref</code> class itself is just a placeholder. It defines a series of inner interfaces, each specialized to hold a reference to a different kind of schema component.
	
	The sole purpose of this indirection layer is to support forward references while parsing the XML representation.
</p><p>
	A typical reference interface would look like this:
</p>
<pre>
public static interface Term {
    /** Obtains a reference as a term. */
    XSTerm getTerm();
}
</pre>
<p>
	Where this indirection is unnecessary, a real object can be used directly: every implementation class of <code>XSTerm</code> implements the <code>Ref.Term</code> interface, and the same holds for the other <code>Ref</code> interfaces. Therefore, wherever a reference is required, you can simply pass a real object. In other words, a direct reference (<code>XS***Impl</code>) can always be treated as an indirect reference (<code>Ref.***</code>).
</p><p>
	Implementations for forward references are placed in the <code>com.sun.xml.xsom.impl.parser.DelayedRef</code> class. The details are discussed later.
</p>
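<p>
	As an illustration of the paragraph on direct references, here is a minimal sketch of a real object standing in for a reference. It is not taken from the XSOM sources: the <code>XSTerm</code> stand-in is reduced to a single method, and <code>ElementDeclImpl</code> and <code>ParticleImpl</code> are simplified, hypothetical classes written only for this example.
</p>
<pre>
```java
final class RefSketch {
    /** Simplified stand-in for the public XSTerm interface (assumption). */
    interface XSTerm {
        String getName();
    }

    /** Same shape as the Ref class: a holder for reference interfaces. */
    static final class Ref {
        private Ref() {}

        interface Term {
            /** Obtains a reference as a term. */
            XSTerm getTerm();
        }
    }

    /** Hypothetical implementation class: a component that is also a reference to itself. */
    static final class ElementDeclImpl implements XSTerm, Ref.Term {
        private final String name;

        ElementDeclImpl(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }

        @Override
        public XSTerm getTerm() {
            return this; // a direct reference resolves to the object itself
        }
    }

    /** Hypothetical consumer that only ever stores the indirect reference. */
    static final class ParticleImpl {
        private final Ref.Term term;

        ParticleImpl(Ref.Term term) {
            this.term = term;
        }

        XSTerm getTerm() {
            return term.getTerm();
        }
    }

    public static void main(String[] args) {
        ElementDeclImpl element = new ElementDeclImpl("item");
        // No wrapper is needed: the real object is passed where a reference is expected.
        ParticleImpl particle = new ParticleImpl(element);
        System.out.println(particle.getTerm() == element);
        System.out.println(particle.getTerm().getName());
    }
}
```
</pre>
<p>
	Because the consumer depends only on <code>Ref.Term</code>, the same constructor would also accept a delayed reference without any change.
</p>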



<h1>Parser</h1>
<p>
	The following collaboration diagram shows various objects that participate in a parsing process.
</p>
<center>
	<img src="collaboration.png"/>
</center>
<p>
	<code>XSOMParser</code> is the only publicly visible component in this picture. This class also keeps references to various other objects that are necessary to parse schemas. These include an error handler, the root <code>SchemaSet</code> object, an entity resolver, etc.
</p><p>
	Whenever the <code>parse</code> method is called, it creates a new <code>NGCCRuntimeEx</code> and configures an <code>XMLReader</code> so that the schema file is parsed into this <code>NGCCRuntimeEx</code> instance.
	
	<code>NGCCRuntimeEx</code> derives from <code>NGCCRuntime</code>, which is a class generated by RelaxNGCC. This object uses other RelaxNGCC-generated classes to parse a document and construct an XSOM object graph accordingly.
</p><p>
	When a new XML document is referenced by an import or include statement, a new <code>NGCCRuntimeEx</code> is set up to parse that document, because a single <code>NGCCRuntimeEx</code> can parse only one XML document.
</p>
<h2>Forward references and back-patching</h2>
<p>
	Since we use SAX to parse schemas, the referenced schema component is often unavailable when we hit a reference. Because of this, when we see a reference, we create a "delayed" reference that keeps the name of the referenced component.
</p><p>
	Note that because of the way XML Schema &lt;redefine> works, all the references by name must be lazily bound even if the component is already defined.
</p><p>
	All these "delayed" references are remembered and tracked by <code>XSOMParser</code>. When the client calls the <code>XSOMParser.getResult</code> method, <code>XSOMParser</code> makes sure that each of them resolves to a schema component.
	"Delayed" references are implemented in the <code>DelayedRef</code> class.
</p>
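<p>
	The idea of lazily bound references can be sketched as follows. This is a simplified illustration, not the actual <code>DelayedRef</code> code: <code>SchemaSet</code>, <code>ElementDecl</code>, <code>DelayedElementRef</code> and <code>countUnresolved</code> are hypothetical names, redefinition is modeled simply as "the newest declaration of a name wins", and a fixed-size array replaces whatever storage the real classes use.
</p>
<pre>
```java
final class BackPatchSketch {
    /** Simplified stand-in for a schema component (assumption). */
    static final class ElementDecl {
        final String name;
        final String type;

        ElementDecl(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Hypothetical component table; fixed capacity keeps the sketch short. */
    static final class SchemaSet {
        private final ElementDecl[] decls = new ElementDecl[16];
        private int count;

        void define(ElementDecl decl) {
            decls[count++] = decl;
        }

        ElementDecl find(String name) {
            // Search newest-first so that a redefinition shadows the original.
            for (int i = count - 1; i >= 0; i--) {
                if (decls[i].name.equals(name)) {
                    return decls[i];
                }
            }
            return null;
        }
    }

    /** Keeps only the name and looks the component up each time it is asked. */
    static final class DelayedElementRef {
        private final SchemaSet schema;
        private final String name;

        DelayedElementRef(SchemaSet schema, String name) {
            this.schema = schema;
            this.name = name;
        }

        boolean resolves() {
            return schema.find(name) != null;
        }

        ElementDecl get() {
            ElementDecl decl = schema.find(name);
            if (decl == null) {
                throw new IllegalStateException("undefined element: " + name);
            }
            return decl;
        }
    }

    /** The kind of final check a parser could run before handing out its result. */
    static int countUnresolved(DelayedElementRef... refs) {
        int unresolved = 0;
        for (DelayedElementRef ref : refs) {
            if (!ref.resolves()) {
                unresolved++;
            }
        }
        return unresolved;
    }

    public static void main(String[] args) {
        SchemaSet schema = new SchemaSet();
        // The reference is created before the component exists (forward reference).
        DelayedElementRef priceRef = new DelayedElementRef(schema, "price");
        System.out.println(countUnresolved(priceRef));

        schema.define(new ElementDecl("price", "xs:decimal"));
        System.out.println(priceRef.get().type);

        // A later redefinition is picked up because the name is bound lazily.
        schema.define(new ElementDecl("price", "xs:string"));
        System.out.println(priceRef.get().type);
        System.out.println(countUnresolved(priceRef));
    }
}
```
</pre>
<p>
	Binding by name at lookup time is what lets a reference created early still observe a component that a &lt;redefine> replaces later. An eagerly bound reference would keep pointing at the original declaration.
</p>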


<h2>RelaxNGCC</h2>
<p>
	The actual parser is generated by RelaxNGCC from the <code>xsom/src/*.rng</code> files. <code>xmlschema.rng</code> is the entry point, and all the other files are referenced from this file. For more information about RelaxNGCC, see the <a href="http://relaxngcc.sourceforge.net/">RelaxNGCC web site</a>, or just contact me, as I'm one of the developers of RelaxNGCC.
</p>

</body>
</html>
